Kunoichi-DPO-7B is a direct preference optimization (DPO) fine-tune of the Kunoichi-7B model, trained on Intel's Orca preference data and using the Alpaca prompt template. It is aimed mainly at general-purpose scenarios and offers stronger reasoning and instruction-following abilities than the base model.
Large Language Model
Transformers
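Below is a minimal usage sketch with the Transformers library. It assumes the model is published on the Hugging Face Hub under the ID `SanjiWatsuki/Kunoichi-DPO-7B` (an assumption; substitute the actual repository ID) and that prompts follow the standard Alpaca template mentioned above; the generation parameters are illustrative, not recommended settings.

```python
# Hedged sketch: load Kunoichi-DPO-7B with Transformers and generate a reply
# using an Alpaca-style prompt. Model ID and sampling settings are assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "SanjiWatsuki/Kunoichi-DPO-7B"  # assumed Hub repository ID

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # half precision to fit a 7B model on one GPU
    device_map="auto",
)

# Standard Alpaca prompt template (instruction-only variant).
prompt = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\nExplain what direct preference optimization is in two sentences.\n\n"
    "### Response:\n"
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=256,
    do_sample=True,
    temperature=0.7,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```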